This repository has been archived by the owner on May 7, 2021. It is now read-only.

ore: Introspect disk for size rather than hardcode 8GB #944

Merged
merged 1 commit into coreos:master on Nov 16, 2018

Conversation

cgwalters (Member)

We want to expand the disk size of Red Hat CoreOS. Use the JSON
output of qemu-img info to get the size of the VMDK, and pass
that on to AWS.

The hardcoded 8GB moves to Plume. I think to support larger disks
there then we'd need to query the snapshot or so?
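
For context, here is a minimal sketch of that introspection, not the actual ore code: shell out to qemu-img, parse its JSON output, and convert the reported virtual size to whole GiB. The helper name getImageSizeGiB is illustrative, and it assumes qemu-img is on PATH. The review thread that follows is on the corresponding hunk.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

// qemuImgInfo holds the one field of `qemu-img info --output=json` we care about.
type qemuImgInfo struct {
	VirtualSize uint64 `json:"virtual-size"` // bytes
}

// getImageSizeGiB returns the image's virtual size in whole GiB (rounded down).
func getImageSizeGiB(path string) (uint32, error) {
	out, err := exec.Command("qemu-img", "info", "--output=json", path).Output()
	if err != nil {
		return 0, err
	}
	var info qemuImgInfo
	if err := json.Unmarshal(out, &info); err != nil {
		return 0, err
	}
	return uint32(info.VirtualSize / (1024 * 1024 * 1024)), nil
}

func main() {
	size, err := getImageSizeGiB("disk.vmdk")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d GiB\n", size)
}
```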

if err != nil {
	return 0, err
}
return uint32(info.VirtualSize / (uint64(1024) * 1024 * 1024)), nil
Contributor

This will round down, is that intended?

Member Author

I thought about that, but right now the disk images we're providing are exact multiples of GiB. We could make it a fatal error to have it not be exact?

I am not clear what the semantics are here - presumably if it wasn't a GiB multiple one would need to round up?

Although now that I actually look at the docs, I notice that yes, we'd need to round up, but also:

> Default: If you're creating the volume from a snapshot and don't specify a volume size, the default is the snapshot size.

Which makes sense...so why are we specifying it in the first place? A little git blame later:

f0f298a

"The default if nothing is set is 5GB" is confusing - were CL disk images 5GB at that time and he wanted AWS to be larger?

Contributor

CL images are 4.5 GB right now, so this is (a) broken, and (b) a behavior change.

Member Author

Ah. Well that explains the "5GB" bit probably. But then where did 8GB come from?

Contributor

We create 8 GB images in AWS, so that the root FS has a reasonable amount of free space.

For compatibility, maybe use max(8 GB, actual image size)?

Member Author

> We create 8 GB images in AWS, so that the root FS has a reasonable amount of free space.

Only for AWS? Like Azure and other clouds are still 5GB?

> For compatibility, maybe use max(8 GB, actual image size)?

Hmm, if I do that unconditionally then this entire PR would be pointless since I want to grow RHCOS beyond 8GB, right?

It seems like we'd need a new --im-not-cl flag? Or maybe we only clamp if amiName.contains("Container-Linux") ?

Member Author

Sorry, I was confused...yeah, we can do max.

Member Author

TFW one finds golang/go#20775

Oh well, these are small integers, we can be confident we're not going to hit FP accuracy issues.
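
To make that concrete, here is an illustrative, integer-only version of the round-up-and-clamp logic that sidesteps math.Max's float64 parameters entirely; the names are hypothetical, not what plume actually uses:

```go
// minDiskSizeGiB is a hypothetical name for the old hardcoded 8 GiB floor.
const minDiskSizeGiB uint32 = 8

// diskSizeGiB converts a byte count to GiB, rounding up, then clamps the
// result to the minimum using a plain integer comparison (no math.Max).
func diskSizeGiB(virtualSizeBytes uint64) uint32 {
	const gib = uint64(1024 * 1024 * 1024)
	sizeGiB := uint32((virtualSizeBytes + gib - 1) / gib) // round up to a whole GiB
	if sizeGiB < minDiskSizeGiB {
		return minDiskSizeGiB
	}
	return sizeGiB
}
```

With that, a 4.5 GB Container Linux image rounds up to 5 GiB and then clamps to 8 GiB, preserving the old behavior, while a larger RHCOS image passes through unchanged.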

Contributor

Azure images emitted by our build system are 30 GB and GCE images are 8.5 GB. We're not consistent.

Member Author

OK another updated version, still not tested beyond compilation though ⬇️

@cgwalters (Member Author)

Updated for comments, also tried to consistently use GiB here since it's important.

I didn't test it beyond compilation yet, though.

cgwalters force-pushed the 8gb-isnt-enough-for-everyone branch 2 times, most recently from 8861d98 to 0fc6af8, on November 8, 2018.
@cgwalters (Member Author)

FWIW, we're currently carrying this patch for RHCOS, which also implies we have a forked coreos-assembler that is going to be annoying to maintain. I can retest it; any other code comments?

@bgilbert (Contributor) left a comment

Code LGTM pending testing.

@@ -36,6 +36,13 @@ var (

type EC2ImageType string

// In GiB. This used to be hardcoded in ore. See discussion in
Contributor

It was hardcoded in this file, actually. I'm still in favor of dropping that sentence, but 🤷‍♂️

Member Author

Updated to tweak the comment ⬇️

We want to expand the disk size of Red Hat CoreOS.  Use the JSON
output of `qemu-img info` to get the size of the VMDK, and pass
that on to AWS.

The hardcoded 8GB moves to Plume.  I think to support larger disks
there then we'd need to query the snapshot or so?
@cgwalters (Member Author)

Also, I did test this latest PR, but only with RHCOS, not CL yet. I think it'll work; if not, we'll find out on the next nightly job, right?

@bgilbert (Contributor)

We'll find out on the plume side if we check the uploaded image by hand. We don't run ore automatically. You could download an AWS image for testing ore.

@cgwalters (Member Author)

> You could download an AWS image for testing ore.

I did that and uploaded it, and it failed; I didn't have time to debug it in depth. It turns out the problem is that it's a raw image and not a VMDK, which I found surprising. Converting it to VMDK before uploading works for me.
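
(Presumably the conversion is something along the lines of `qemu-img convert -f raw -O vmdk -o subformat=streamOptimized disk.raw disk.vmdk`; the stream-optimized subformat is an assumption about what the AWS import path expects, not something stated in this thread.)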

@bgilbert (Contributor)

Ah, yeah, for historical reasons there's a separate VMDK image, ami_vmdk.

Good to merge?

@cgwalters (Member Author)

good to merge

bgilbert merged commit 7ee5009 into coreos:master on Nov 16, 2018.
cgwalters added commits to cgwalters/mantle referencing this pull request (Nov 29 and Nov 30, 2018), each with the message:

We want to expand the disk size of Red Hat CoreOS; previously
there was a hardcoded 8GiB size for Container Linux which
(I believe entirely coincidentally) matched RHCOS.

See coreos#948
and coreos#944
@cgwalters (Member Author)

This was reverted in #948
